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ABSTRACT 

We present a general probabilistic formalism for cross-identifying astronomical point sources in multiple 
observations. Our Bayesian approach, symmetric in all observations, is the foundation of a unified framework 
for object matching, where not only spatial information, but physical properties, such as colors, redshift and 
luminosity, can also be considered in a natural way. We provide a practical recipe to implement an efficient 
recursive algorithm to evaluate the Bayes factor over a set of catalogs with known circular errors in positions. 
This new methodology is crucial for studies leveraging the synergy of today's multi-wavelength observations 
and to enter the time-domain science of the upcoming survey telescopes. 
Subject headings: astrometry — catalogs — galaxies: statistics — methods: statistical 
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1. MOTIVATION 

Observational astronomy has changed drammatically over 
the last decade. With the introduction of large-format, high- 
resolution detectors at all wavelengths of the electromagnetic 
spectrum, astronomers now face an avalanche of data pour- 
ing from the instruments of dedicated telescopes. While 
most imaging surveys today obtain multicolor information, 
no one telescope can cover the entire spectrum because the 
physics of the detectors is very different at different frequen- 
cies. To fully utilize the available observations, e.g., to boost 
the chances of discovering new kinds of sources, and under- 
standing the underlying physical relations of object properties 
in a statistical way, one needs to merge the datasets of various 
telescopes by federating the archives. The Virtual Observa- 
tory initiative spearheaded by the International Virtual Ob- 
servatory Alliance' (IVOA) is pursuing automated data ex- 
change protocols with catalog cross-identification, and the 
US National Virtual Obs ervatory^ (NVO) is building tools, 
e.g.. Open SkyQuery ( Budavari et al.l 120041) to facilitate a 
standard unified framework. The key step in the process is 
the cross-identification of the sources in multiple catalogs to 
link observations of one telescope to other's. Previous at- 
tempts to alleviate the proble m utilized likelihood analysis 
([Sutherland & Saunders^ '1992*) and machine learning (ML) 
techniques (Rohde et al. 2006) that addre ssed specific issues 
of the matching problem of two catalogs. iMann et aP (Il997h 
successfully applied the former likelihood ratio method to 
associate sources in the Infrared Space Observatory and the 
Hubble Deep Field catalogs, and the ML techniques were 
used to study the SuperCOSMOS observations and HI Parkes 
All Sky Survey. 

Today astronomers typically join two catalogs by setting 
some threshold on the angular separation of sources that is 
motivated by the astrometric accuracies of the datasets in- 
volved. When more than two catalogs are to be crossmatched, 
astronomers often hatch a chaining rule based on the implicit 
prior knowledge about the sources. For example, one might 
decide to match all lower-accuracy datasets to the best one, or 
to go from wavelength to wavelength, hoping that the sources 
do not change significantly over a shorter wavelength range. 

' http://www.ivoa.net 
^ http://us-vo.org 



The problem with these traditional ways is not that they are 
based on implicit assumptions and intuitions but that they are 
not symmetric. While the pairwise matches might be accept- 
able, there is no guarantee, or any measure of quality, that the 
elected final matches are plausible or if the list is complete. 
After all picking a different order of pairwise matching would 
yield a different catalog. 

We need algorithms that are symmetric in the catalogs and 
provide a reliable measure of quality that one can use to ex- 
clude or down weight unlikely combinations of sources. We 
need a unified framework, where on top of the spatial infor- 
mation, other measurements can also be incorporated along 
with explicit models and physical priors. In Section|2]we dis- 
cuss the Bayesian approach to address these issues, and in 
Section[3]the spherical normal distribution is studied. In Sec- 
tions |4] we demonstrate how to include other observational 
evidence such as from multicolor photometric measurements. 
Section |5] focuses on the effects of a limited field of view on 
the observational evidence, the prior and posterior probabil- 
ities. In Section |6] an efficient implementation of the frame- 
work is described in the detail, and Section [7] concludes our 
study. 

Throughout the paper we follow the usual convention of us- 
ing the lower-case p symbol for representing probability den- 
sity functions and the capital P symbol for probabilities. 

2. OBSERVATIONAL EVIDENCE 

Often Bayesian analysis is refered to as the calculus of be- 
lief, however, it should rather be thought of as the calculus 
of observational evidence. When presented with a series of 
observed positions, one would like to know whether they are 
truely from the same source. If the coordinates are scattered 
all over the celestial sphere, it seems very unlikely that they 
are measurements of the same astronomical object, but when 
the coordinates are only a tiny fraction of an arcsecond apart, 
we "know" that we found a good match. How good is that 
match? Or what is the evidence that it is a match? 

2. L Modelling the Astrometry 

First let us examine what astrometric precision means. In 
the process of calibrating the positions in a catalog of ex- 
tracted sources, one can characterize the properties of the 
observations by comparing the positions to astrometric stan- 
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dards, and even correct for systematic offsets. Yet, there re- 
mains a random scatter around the true positions. This uncer- 
tainty is often modelled as a normal distribution, and catalogs 
would quote a single cr-value for their accuracy, e.g., a = 0.1 
arcseconds. In general, our understanding of the astrometry 
is described by a probability density function (PDF) that may 
even vary on the sky. We parameterize our model M that the 
object is on the celestial sphere using a three-dimensional nor- 
mal vector m, and write p(x\m,M) for the probability density 
that an object at its true location m is observed at a position x. 
As any PDF, this function is normalized. 



p{x\m,M)d\ = 1 



(1) 



Now we take a single source observed at ii and apply Bayes' 
theorem to find the posterior density of the true location m 
given the obtained data. 



p(m\xuM) = 



p{xi\m,M)p(m\M) 
pixi\M) 



(2) 



where the trivial prior p(m\M) of m being on the celestial 
sphere is expressed with Dirac's ^-symbol. 



p(m\M)=^Si\m\-l) 
Ait 



(3) 



and the normalizing constant guarantees the law of total prob 
ability, 

pix\\M)= / p{m\M)p{x\\m,M)(f'm 



(4) 



Another interesting direct application is the calculation of the 
chance that we find a visible object at position m in a given 
footprint. If the angular window function is 17, this probability 
is simply 

P{VL\m,M) = I p(j^m,M)d^x (5) 

which one can use to infer the PDF on the true position by 
applying Bayes' rule 



pim\Mn) = p{m\Vt,M)-- 



p(m\M)P(n\m,M) 
p(m\M)P{n\m,M)d\ 



(6) 



This is our best understanding of where an object might be 
on the sky (prior to measuring its actual position) that is seen 
in the specified Q, footprint assuming astronometric precision 
p(j^m,M) derived from the calibration. 

2.2. The Bayes Factor 

With multiple observations through various instruments of 
possibly different astrometric accuracies, we now turn to com- 
pute the evidence that all observations are from the same 
source. We introduce the Bayes factor to test this hypothesis 
H against the case when separate sources are possible, K. Af- 
ter the observation are obtained, D = {xi,X2, . . . ,x„} locations 
on the sky, we compute the ratio of the posterior and prior 
probabilities of each hypothesis. The Bayes factor is defined 
as the ratio of these odds. 



which, after applying Bayes' theorem, becomes 

p(D\H) 



B(H,K\D)-- 



p(D\K) 



(7) 



(8) 



for continuous observables. The actual calculation is done 
by parameterizing the two models H and K, and integrating 
the likelihood functions for the entire configuration space. 
Our hypothesis H says that the positions are from a single 
source, thus can be parameterized by a single common loca- 
tion m. Due to the independence of the measurements in D, 
the joint PDF is just the product of the astrometric precisions 
pi,P2,--- ,Pn, and the integral simplifies to 

f " 

p(D\H)= p(m\H)\[pi(xi\m,H)d^m (9) 

On the other hand, the alternative hypothesis K is parame- 
terized by separate {m, } positions, and the integral factorizes 
into the product of the independent components 

p{D\K) = n p{m\K)pi(xi\m,K)d^m)^ (10) 

When the Bayes factor is large, the observations support the 
hypothesis that the association is a match, if it is in the order 
of unity, the evidence is not convincing, and finally if the ratio 
is less than one, the data prefers the alternative hypothesis. 

3. THE NORMAL DISTRIBUTION 

Normal distributions emerge often in nature, where a num- 
ber of effects play roles in shaping up the probability density, 
cf. the Central Limit theorem. Although many of the usual ar- 
guments do not hold over closed topological manifolds, e.g., 
the Centr al Lirnit th eorem leads to isotropic distribution on 
the circle (lLevvl ll939), it is possible to introduce an analogue 
to the normal distrib ution function on the sphere (l Fisherll953l: 
iBreitenbergeJ 1 963h . The spherical normal distribution is of- 
ten elected to characterize the precision of astronomy obser- 
vations, hence it is of great importance to understand its prop- 
erties, and to apply the Bayesian framework described in the 
previous section. 

The spherical normal distribution in its normalized form us- 
ing the previous 3-D vector notation is written as 



N(x\m,w)= — 7—^ — exp(wmi) 



(11) 



Att sinh w 

where the weight w is typically very large. When this is the 
case, the weight is related to the more intuitive precision pa- 
rameter a by the equation 

w= 1/0-2 (^2) 

For example, when a is in the order of an arcsecond, the 
weight takes values of ~ 10"^. Having observed a set of 
positions independently with corresponding weights, we can 
compute the Bayes factor for the two hypotheses H and K 
introduced earlier Because the function N{x\m,w)p{m\M) is 
symmetric in x and m for the trivial prior, and the PDFs are 
normalized, the Bayes factor is computed analytically, and be- 



sinhw 1 r 
B{H,KD)= TT 



Wi 



1=1 



sinhw; 



with 



w = 



E 



WiXi 



(13) 



(14) 



where we exploit the fact that the product of normal distribu- 
tions has the same functional form. A detailed derivation is 
given in AppendixlAl 
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In case of only two observations, this weight depends on the 
astrometric precisions and the angle tp between the positions 

w= \J W[ +W2 + 2W1W2COS V' (15) 

For the typical large weights and small angular separations 
between the measurements, we get 

In Figure [T] the 10-based logarithm of the Bayes factor, also 
known as the weight of evidence, is shown as a function of 
angular separation for the three cases of matching two cata- 
logs of (Ti = 0.1" and (72 = 0.5" to each other and to them- 
selves. This is the problem of matching the Sloan Digital 
Sky Survey (SPSS: lYork et al.1 HOOO: Pieretal. 2003) and 
the Galaxy Evolution Explorer (GALEX; .Martin et al..,2005i: 
iMorrissev et al.ll2"007h science archives. 



TABLE 1 

Weight of evidence for three surveys as a function of the 
angular separations in (t = (ti = (t2 = (73 = 0.1" units 
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Fig. 1. — The weight of evidence as a function of angular separa- 
tion for the three cases of matching two catalogs of cr[ = 0.1" and (T2 = 
0.5" to each other and to themselves. For example, matching SDSS 
and GALEX sources: SDSS-SDSS (sohd), SDSS-GALEX (dotted) and 
GALEX-GALEX (dashed). 

Matching three catalogs also makes an interesting case 
study for the various potential configurations of three posi- 
tions. The Bayes factor for this case, in the same limit as 
previously, takes the form of 



B = 



9 9 9 9 9 9 



(17) 



In Table [T] the weight of evidence is shown for various con- 
figurations from the matching of three similar catalogs with 
equal astrometric accuracies, ai = a2 = (t^ = 0.1" (separations 
listed in a units.) The astrometric precision was chosen to 
match the nominal SDSS limitations. 

In general, the Bayes factor for the typical large weights 
and small angular separations takes the form of 



:2''-'fl^exp. 



2Ew,- 



(18) 



where all summations and products run on the members of the 
tuple from the n number of catalogs; see Appendix IB] for the 
details of the calculation. 



In scenarios where individual errors are different or even 
anisotropic, one can generalize our expression in a fairly 
straightforward manner in the above approximation. Instead 
of the scalar weight, one can use the inverse of the covariance 
matrix, however, the elegant simplicity of the expressions is 
sacrificed. 

4. FOLDING IN THE PHYSICS 

Naturally the formalism introduced in Section|2]is not spe- 
cific to astrometric observations. In fact, it is rather straight- 
forward to fold other measured quantities into the calcula- 
tions. This is especially important when dealing with mul- 
tiple matches. Picking the "correct" combination of sources 
from various spatially similar configurations is a degenerate 
problem that requires extra information to resolve. The use of 
photometric information is a natural choice for its wide avail- 
ability, however, its application requires further assumptions 
on the spectral energy distributions (SEDs). Often models ex- 
ist to help out with the solution, but extra caution is needed 
to avoid any undesirable effect. For example, when the goal 
is to discover new types of objects with unknown SEDs, one 
should not apply known SEDs as priors but rather look for 
combinations that are likely matches based on spatial detec- 
tions but excluded by SED modelling. 

As a demonstration of these ideas, let us apply the intro- 
duced Bayesian framework to photometric measurements in 
various passbands. The ingredients include the following fur- 
ther explicit models: 

1 Model S for the spectrum energy distributions, e.g., by 
iBruzual & Chariot (l2003h . described by a set of param- 
eters, ff: s(\\ff,S) along with the corresponding piff\S) 
priors; 

2 Model R for the transmission of the passbands to calcu- 
late simulated fluxes 'y(f]\S,R) by integrating the SEDs 
.s(A|rf, 5) with the appropriate response functions; and 

3 Model C for the uncertainty of the catalog from the pho- 
tometric calibration, p(g\j,C), where gis the observed 
flux set and 7 is the true. 

These separate models can be folded into a single model M, 
for simplicity, so one can write p{g\iJ,M) for the probability 
density of measuring g fluxes for an object with ff physical 
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properties of S seen through the filters in R with the C photo- 
metric accuracy. The Bayes factor for the photometry in the 
face of the observed fluxes D' = {^1,^2, ■■■ ,gn}, similarly to 
the astrometric formulas, is given by the ratio 



p(rj\H)Y[pi(gi\rf,H)d'-', 



B(H,K\D') = 



Yl \ / p{fji\K)piigi%,K)d% 



(19) 



In the simplest case, S is parameterized by a discrete spec- 
tral type r, the redshift z and an overall scaling factor for the 
brightness, a: 

l = af{T,z) (20) 

where / is a vector of the simulated photometry in the various 
passbands. Photometric uncertainties are often assumed to be 
Gaussian with a diagonal covariance matrix of elements aj, 
where I runs on the L number of passbands. After substitution, 
we arrive at the familiar formula of 



p(gl?7,M)= -^exp. 



-E 



{gi-af,{T,z)Y 
2af 



(21) 



where constant M is the usual normalization factor of the 
multivariate normal distribution, which in our special case is 
just M = {2'K)^l^ai(j2 • • • ctl- Integrating these models to get 
the Bayes factor is a very similar problem to template fitting 
photometric redshift estimation. In fact, the two procedures 
can be done in a self-consistent way within the same appli- 
cation. Naturally, spectroscopic redshift measurements can 
be directly incorporated in this analysis, when available, but 
other data can also enter in a straightforward manner. 

The Bayesian analysis is inherently recursive. As soon 
as we obtain new measurements, and compute the posterior 
probability, that becomes the prior for subsequent studies. 
This is an extremely powerful property, and simplifies the 
computations enormously. A consequence of this is that the 
combined Bayes factor of the astrometric and photometric 
measurements is simply the product of the two, 

B=Bpos-Bphot (22) 

as also seen from the Bayes factor's definition. This means 
that one can just do the spatial join first, and consider addi- 
tional measurements and physical priors in subsequent steps, 
if needed. 

5. FROM PRIORS TO POSTERIORS 

The Bayes factor naturally relates the prior and posterior 
probabilities. When K is the complementary hypothesis of H, 
the posterior probability is 



P(H\D) = 



1 + 



l-P{H) 



BP{H) 



(23) 



(24) 



which, in the limit of vanishing priors, becomes 

BP(H) 
P{H\D)= -— 

To make a definitive decision on whether a set of detections 
should be considered a match, one would like to set a limit on 
the posterior probability and derive the Bayes factor thresh- 
old from that, however, this can only be done with an initial 
estimate of the prior. 



5.1. The Prior and the Selection Function 

The prior probability depends on the angular and radial se- 
lection functions of the observations. If the visible universe 
contains objects, and we select two of them at random, the 
probability of picking the same object is \/N. When selecting 
n objects, the probability is 1/N"~\ A limited field of view 
shrinks the observable volume, hence decreases the number 
of objects, and increases the prior probability. When the an- 
gular selection functions of the catalogs overlap only partially 
then one can just consider the intersection of the sky coverage 
and the smaller number of sources within. 

The various radial selection functions also have a signifi- 
cant role, and make the situation more complicated. In order 
to consider their effect, one has to estimate the overlap of the 
selections in the input catalogs. Every catalog has observa- 
tional constraints, other than the field of view, like flux limits, 
that set the radial selection function. The superset of these 
contraints defines the restrictions on the overlap catalog. Let 
A'* denote the number of objects in that catalog. In this gen- 
eral case, the prior probability takes the form of 



PiH) = NjY[Ni 



(25) 



When the limitations are identical, all catalogs have equal 
number of objects, Ni, = Ni = . . . = N„, and we get back the 
same formula of P(H) = 1 /N"~^ as before, but when, for ex- 
ample, one catalog consists of only low -redshift galaxies (e.g., 
z < 0.2), and the other has high-redshift quasars (e.g., z > 3), 
there is no overlap between the two radial selection functions, 
hence A^* = 0, which means P{H) = 0. One can get vanish- 
ing priors even if the redshift histograms overlap significantly, 
e.g., two catalogs of red (e.g., u-g < 2) and blue galaxies 
(u-g > 2). In general, all these complex selection effects are 
captured in a single scalar quantity. A'*, which is estimated 
based on prior physical knowledge, e.g., by using template 
SEDs and the known characteristics of the input catalogs (e.g., 
the luminosity functions), or alternatively, when no prior in- 
formation is available, one can invoke self-consistency argu- 
ments to derive it; see later. We now rewrite the prior with the 
surface densities, i' =N/fl, or the scaled number of objects 
for the entire sky, p = Attv, as 



P{H) = 



47r 



(26) 



This formula also provides a straightforward way to include a 
model for varying surface density on the sky, e.g., for stars, 
where ly = v{x). In this case, a constant limiting posterior 
probability yields a varying threshold on the Bayes factor as a 
function of the position on the celestial sphere. 

5.2. The Bayes Factor and the Window Function 

The field of view not only changes the prior probabilities 
but also modifies the Bayes factor When the window func- 
tion is known, one can refine the prior probability density that 
enters the integral of the numerator and denominator of the 
Bayes factor This is done by adopting eq. |6] as the prior. In 
first order, for the typical catalogs with large weights (high ac- 
curacy) and large contiguous observation areas, this new prior 
is uniform over the window function, neglecting the fuzzy 
boundary, except scaled by the area coverage 



p{rh\Mn)=^5{\m\ 



1) 



(27) 
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where again il(m) is the window function that takes the value 
1 when m is inside and otherwise, and is its area. The 
Bayes factor inside the footprint will be essentially same as 
before in eq.[T3]but also scaled with these fractional coverage: 



4-n 



sinhw 



n 



sinhvv, 



(28) 



The edge effect modifies this only for a tiny fraction of the ob- 
jects at the boundary of the observations. The proper integral 
is, of course, much more expensive than this analytical for- 
mula, but can be evaluated or re-evaluated, e.g., by an MCMC 
algorithm. 

For the typical small priors, the posterior depends only on 
the product of the Bayes factor and the prior; see eq. |24] 
This means that the footprint effect cancels out in the pos- 
terior probability; cf. eqs. I26landl28] Hence it is still sensi- 
ble to just simply use the all-sky formula in eq.[T3]and[T8]as 
long as the prior is written accordingly, i.e., p*/ p\P2 - ■ ■ Pn- 
From the data providers' point of view, who often do not 
know the field of view, e.g., the legacy catalogs in VizieR 
dOchsenbein et a l. 2000) or the N ASA/IPAC Extragalactic 
Database ("NED: Madore et al.l 1992 1 the best quantity to pub- 
lish along with the matched tuples is also the analytic all-sky 
Bayes factor, so researchers can incorporate their own prior 
knowledge, and set the thresholds on the posterior accord- 
ingly that are often specific to the science apphcation. 

5.3. Self-Consistent Estimation 

In principle, the cross-identification process is now com- 
plete, one just has to formulate the prior, possibly varying on 
the sky, and set a threshold on the posterior probabilities to 
select the matches. For the ignorant without a priori knowl- 
edge, these are not completely independent choices, and, at 
least in the limit when all observables are being considered 
in the Bayes factors, could be derived from requirements of 
a self-consistent field theory. When prior knowledge is avail- 
able and dictates a preference, one could and probably should 
still check for the consistency outlined here to understand the 
discrepancies, if any. 

The formula for the prior in eq. |25]is in fact equivalent to 
stating that P{H) is constant and 



(29) 



where the summation runs over the direct product of all sets of 
sources in the n catalogs, i.e., all possible combinations of de- 
tections with A^iA^2 ■ ■ 'Nn contributions. The self-consistency 
argument requires that 



Y,P(H\D)=N, 



(30) 



which is an equation for Ni, that can be solved by, e.g., some 
iterative approximation method starting from an initial value 
of A^* = min{A^,}. Initial experiments support our expecta- 
tions that these procedures indeed converge very rapidly, only 
in a few iterations, and are insensitive to the matching limit 
once the Bayes factor is less than unity. For varying un- 
known priors one can use s ome s ky tesse lation schemes, su ch 
as HE ALPix jO orski et al] |2005h . Igloo (ICrittendenll2000h or 
HTM (ISzalav et al.ii2005i) . and estimate a piecewise constant 
prior (uniform in the cells) using the same methodology. Nat- 
urally other more sophisticated models can also be used in 
the same spirit, e.g., specific functional forms or smoothing to 
limit the gradient, as well as tapered windows when required. 



The threshold on the posterior, Pj, can also be estabhshed 
in a consistent way. Here the requirement is that 



P{H\D)>Pt 



(31) 



This is equivalent to applying a Bayes classifier. By chang- 
ing the right hand side of the above equation, it is possible to 
make the selection more restrictive or less depending on the 
scientific goal. In the case, where the prior changes on the sky 
and eq.|30]is solved in cells of some pixelization, one can still 
just use a single Pr limit obtained from the entire catalog by 
ensuring that the total number of objects are consistent. The 
counts in individual cells may not be perfectly recovered but, 
if the prior is right, there should be no significant trends. 

6. PRACTICAL CONSIDERATIONS 

The question remains how to evaluate the Bayes factor ef- 
ficiently for multiple catalogs without considering all possi- 
ble combinations of sources. Fast algorithms exist to match 
two sets of point sou r ces using an angula r separation limit 
(Budavari et al. 2003; Malik etal. 2003; Grav et al. '2004 
2006; Szalay et al. 2005; Nieto-Santisteban 2007). Ideally 
one would like to leverage the power of these two-way cross- 
match engines in a recursive manner, and get rid of unlikely 
combinations with small Bayes factors as early as possible. 

Matching two catalogs is straightforward; any Bayes factor 
limit corresponds to a single distance cut, and hence our ex- 
isting tools are adequate. To go from n number of catalogs 
to «+ 1, we need to make this process iterative, and prune the 
match list step-by-step. We do this by computing the overall 
Bayes factor in every step assuming that all other subsequent 
catalogs will contribute sources at the best possible position. 
This optimization problem may be expensive to solve in gen- 
eral, but can be analytically calculated in special cases, and 
for the spherical normal distribution the solution is evident: 
the center position of the mode is the correct choice. 

In fact, for the normal distribution one can do even bet- 
ter In every step, a new catalog is added to the current sub- 
matches. Since the product of normal distributions is still of 
the same functional form, one can compute the Bayes factor 
as a function of angular separation from that position, derive 
the limiting radius, and utilize a two-way crossmatch engine 
for joining the current fc-tuples with the new {k+ 1)* catalog 
using that threshold. For this we rewrite the logarithm of the 
Bayes factor in eq. [18] in the more convenient form of 



lnB = lnA^- 



1 " 

1=2 



ai 



with the newly introduced variables 



N = 2" 



a* = E Wi 

1=1 

Ai=Xi-Ci-i 



(32) 



(33) 



(34) 



(35) 



where q is the unit vector of the best position for the current 
A;-tuple of sub-match, 



1=1 



(36) 
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With these we compute the weight of evidence in a recursive 
manner The iteration starts by substituting ci =i*i. In the 
A:* step, the maximum search radius p^^+i is computed from 
eq.[32]to yield the Bayes factor threshold Bq by assuming op- 
timal matches from the subsequent catalogs with vanishing 
A? contributions. 



= 21n- 



k 

-E 



biAf with bk = 



Ok 



(37) 



We assign every source within that radius to each fc-tuple sub- 
match, and go to the next catalog. In general, the search radius 
will be different for every tuple for their different spatial con- 
figurations. When the two-way matching algorithm requires a 
fixed radius, one can take the maximum value in linear time, 
use that more generous search radius in the matching, and fil- 
ter the result set later, just before going to the next catalog. 
From catalog to catalog we propagate only the quantities that 
are necessary to calculate the weight of evidence. The recur- 
sion formulas are given by the following expressions: 



ak = ak-i+Wk 

fl/fc-l . 2 

qk = qk-\ + — wki^k 

at 



Q= 1 Q_i + — /\k 
ak 



Q-I + Aa: 

ak 



(38) 
(39) 

(40) 



This stepwise method for evaluating the weight of evidence 
not only provides an accurate match list that meets all our re- 
quirements enumerated in Section [T] e.g., symmetry in the 
catalogs, but also exhibits the performance of the current 
state-of-the-art two-way crossmatching tools. 

7. SUMMARY 

We presented a general probabilistic formalism for cross- 
identifying astronomical point sources. The framework is 



based on Bayesian hypothesis testing to decide whether a se- 
ries of observations truly belong to a single astronomical ob- 
ject. The expression we derived is completely general, sym- 
metric in all observations, and accommodates any model of 
the astrometric precision. We introduced the spherical normal 
distribution, and calculated the Bayes factor for the generic 
n-way matching problem both in the general case and in the 
typical limit of high precision and small angular separations. 
The cases of 2- and 3-way matching were studied in detail. 
We discussed an efficient evaluation strategy of the Bayes 
factor that leverages the power of existing high-performance 
two-way matching tools in a recursive manner, yet, it provides 
accurate measurements of the observational evidence that are 
independent of the order of the catalogs considered. While the 
normal distribution is the simplest to work with for its unique 
properties, other specific PDFs can be handled in the same 
spirit. Our technique provides a natural mechanism to include 
other observed properties. We demonstrated how multicolor 
survey data, even at different wavelengths, can be utilized in 
the matching process by invoking SED models. Morpholog- 
ical classification or redshift measurements, when available, 
will also increase the accuracy of the results. 

The beauty of our approach to the cross-identification prob- 
lem is that it completely separates the dependence on each 
parameter, while providing the opportunity to incorporate 
them in a fairly straightforward way. Including expert knowl- 
edge about the physics of the objects in the analysis is eas- 
ily achievable by adopting the right priors, and when such 
information is not available, self-consistency arguments can 
guide the process to a stable solution in a few iterations. With 
the pre-computed Bayes factors in the matched catalogs, as- 
tronomers can define custom thresholds to derive specialized 
crossmatch catalogs based on their own explicit assumptions. 
For example, using a database of the same set of associations, 
researchers can optimize for completeness of the galaxy pop- 
ulation, or even search for unusually red objects. 



APPENDIX 



THE BAYES FACTOR AND THE SPHERICAL NORMAL DISTRIBUTION 

In this appendix we discuss the mathematical calculation of the Bayes factor in the common case, when a spherical normal 
distribution is assumed for modelling the astrometric accuracy. In addition we also adopt an all-sky prior in this derivation. 
The Bayes factor is the ratio of the likelihoods, p(D\H) and p(D\K), where again D represents the observed positions, {xi}. 



B- 



p{D\H) 
p(D\K) 



(Al) 



We recall that hypothesis H is parameterized by a single position, m unit vector, and K is parameterized by a set of n position 
vectors, {m,}. The basic equations to start from are 



p{D\H)= cPm p(m\H) p(D\m,H) 



piD\K) = jcPm, jd^im . . .jd^m„ p{m \K)p(m2\K) . ..pim„\K) p(D\ {m,} ,K) 



where 



p(m\M) = 



S(\m\-l) 



4-Tr 



p({xi} \m,H) = TTA^(i;|;ii, w,) = TT — At expiwiXifn) 

47rsmhw/ 

p{{xi} I {mi} ,K) = Y\N(xi\mi,Wi) = TT — r\ exp(w,i;7i?,) 



(A2) 
(A3) 

(A4) 
(A5) 

(A6) 



Probabilistic Cross-Identification of Astronomical Sources 

First we focus on hypothesis H 



p{D\H)= / d'm — - — — exp(w;Xim) (A7) 

J 47r 47rsinhw/ 

(jj S(\m\-l) ^ \ 



introduce 



wx 



= Y,WiXi (A9) 



where i'is a unit vector, and write 



p^D\H)= ( TT!!i&l) ) / d^JMlRe^piwxm) (AlO) 
\ 47rsinhwi J J 4-tt 

f sinhw -fj Wid{\xi\-l)\ f . w6(\m\-l) 
- ' ' I I : / d m : exp(wxm) (All) 



w sinh Wi 4tt 

t 

The hkehhood of the alternative hypothesis K is calculated similarly 



\ w 47rsinhw, / / 47rsinhw 

\ i / 

sinhw-A- Wi 5{\x:\—\) 

J- J- "Sinn W: O-iT 



/:'(D /i:)= / t/V- -. -. — r-r exp(w,x/m,) (A13) 

-'■.-'-J 47r 47rsmhw,- 



Hence the Bayes factor is 

si 

R= - 

w sinhw; 



^^sinhvv"_M^ (A15) 



as also shown in eq. [13] 



HIGH ASTROMETRIC ACCURACY AND SMALL SEPARATIONS 



The astrometric precision of the actual observations is almost always extremely high in the absolute sense, so it is worth 
examining the approximation of the Bayes factor in this limit. We also assume small angular separations. In the chain of 
equations below we only use the "w" sign to signal new approximations. We start from the previous result 

^^sinhvvA^ (Bl) 
w sinh Wi 

i 

w " 

«2«-'-TT^ (B2) 

/ 

w 

= 2«-in^.^'^'te-0 (B4) 
w 

where we exploit the fact that all w weights are large, hence the sinh w is approximately ^ expw. We proceed by calculating 

2 ,.,2 

T2 (B5) 



w 



(B6) 



W ; COS ?/;; ; 

(B7) 



Ei + 2 J2i<j j cos tp, 
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E,->v? + 2E,.<;W/Wj(l-V',V2) 



,2 



After taking the square root of the above equation, we get 

w 



and 



From the above equations we also see that 



(B8) 
(B9) 
(BIO) 



1-5^^^^ (BID 



l.^^U-^ (B13) 

in this context to only keep the leading term. By substituting the above two equations to eq. IB4I we arrive at our generic small 
angle result shown in eq.[T8] The 2- and 3-way matching cases are straightforward specializations of the generic equation, where 
one substitutes w, = l/af to work out the simplified formulae. 
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